Truly atomic deployments with NGINX and PHP-FPM
Notice: This solution only works if NGINX and PHP-FPM both reside on the same server.
Running ‘git pull’ on a live server isn’t an ideal way to deploy, since the files don’t all change on disk at the same instant. A request that starts on one version of the code might load other files that get updated while the request is still running. To get truly atomic deployments, deployment tools like Capistrano keep a symlink pointing to the current build and simply repoint it to the new build once the new build folder is ready. As long as the symlink is replaced with a rename (rather than deleted and recreated), the switch is atomic: at any given moment the symlink resolves to either the old build or the new one, never to a mix of both.
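As a minimal sketch of such a symlink switch (not necessarily how Capistrano does it internally), the new link can be created under a temporary name and then renamed over the live one. The temporary directory, the build-1/build-2 folder names and the switch_release helper are all hypothetical, and mv -T assumes GNU coreutils.

```shell
#!/bin/sh
set -eu

# Hypothetical layout in a scratch directory (stand-in for /var/www).
base=$(mktemp -d)
mkdir -p "$base/releases/build-1" "$base/releases/build-2"
ln -s "$base/releases/build-1" "$base/current"

# switch_release LINK TARGET (hypothetical helper)
# Creates a new symlink next to the live one, then renames it over the
# live one. The rename replaces the old link in a single step, so LINK
# always resolves to either the old or the new target.
switch_release() {
    ln -s "$2" "$1.tmp.$$"
    mv -T "$1.tmp.$$" "$1"   # -T (GNU): treat LINK as a file, not a directory to move into
}

switch_release "$base/current" "$base/releases/build-2"
readlink "$base/current"
```

Deleting the link and recreating it would leave a short window in which the link does not exist at all, which is why the rename matters here.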
This build process has some issues.
The setup:
1) NGINX (the web server) and PHP-FPM (the PHP FastCGI Process Manager) both reside on the same server.
2) NGINX serves from the document root (/var/www/app.com), which is a symlink to the current build.
3) A visitor requests https://app.com/hello.php. NGINX proxies the request to PHP-FPM and asks it to execute /var/www/app.com/hello.php. PHP-FPM returns the output of that script to NGINX, and NGINX serves it back to the visitor.
4) PHP Opcache is a cache which maintains a mapping of (script path -> compiled bytecode). This cache avoids compiling the same PHP code again and again and makes a big difference in performance.
5) PHP’s Realpath Cache is a cache of resolved (real) paths for the files that PHP scripts open or include. It also makes a big difference in performance if the scripts use a lot of ‘require/require_once’ and ‘include/include_once’ statements.
The problem
The last step in the build process is changing the symlink. After the symlink changed, FPM was still executing PHP scripts from the old build folder.
Then came the realization. If NGINX tells PHP-FPM to execute /var/www/app.com/hello.php, Opcache simply picks up the old compiled version of the script (since Opcache still gets the same path for the file) unless its cache is flushed.
Opcache itself consults Realpath Cache for some path mappings.
What if we could flush Opcache and Realpath cache?
Even if both caches are flushed on each build, there would be a few milliseconds between flushing the caches and changing the symlink. During that window, incoming requests would start repopulating the caches from the old build, and the flush would have been of no use.
If the caches are flushed after the symlink change instead, old code would still get executed in the time between the symlink change and the flush.
What if PHP-FPM is reloaded?
This itself had two issues:
a) Some requests could be dropped or fail to complete while FPM reloads.
b) Again, there could be a few milliseconds between changing the symlink and the start of the FPM reload. Old code could get executed in that window, since the symlink would already have been changed by then.
Therefore, this clearly wasn’t a solution.
What if NGINX serves from the new build folder(not the symlinked one) and is reloaded?
NGINX can serve from the new build folder rather than the symlinked folder, thereby avoiding all the caching issues.
The document root could be dynamically changed in the NGINX conf file on every build, and then NGINX could be reloaded.
This was a solution potentially worth looking into, but I wanted to avoid reloading NGINX.
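For reference, a rough sketch of what this rejected approach might have looked like in a build script. The conf file here is a temporary stand-in with a single root directive, and the /var/www/builds/... paths are hypothetical. The reload itself is left as a comment so the sketch runs without NGINX installed.

```shell
#!/bin/sh
set -eu

# Temporary stand-in for the site's NGINX conf file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
server {
    root /var/www/builds/build-1;
}
EOF

# Hypothetical path of the freshly prepared build.
new_root=/var/www/builds/build-2

# Rewrite the root directive to point at the new build folder,
# keeping the line's original indentation.
sed "s|^\([[:space:]]*root[[:space:]]\{1,\}\).*;|\1$new_root;|" "$conf" > "$conf.new"
mv "$conf.new" "$conf"

# On a real server this step would be followed by a config test and a reload:
#   nginx -t && nginx -s reload

cat "$conf"
```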
The solution
After some research, I found an easy-to-implement solution:
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
=>
fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
A simple change from $document_root to $realpath_root in the NGINX configuration makes NGINX pass the actual script path (resolved after following symlinks). Since PHP-FPM now gets the actual path of the script (new on every build), all caching issues go away, because every file has a new path after each build.
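The difference between the two variables can be approximated in the shell: the symlinked path stays the same across builds, while the resolved path changes. This is only an illustration that uses readlink -f (GNU coreutils assumed) as a stand-in for the symlink resolution NGINX performs. The scratch directory, build names and variable names are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout: two builds and a symlinked document root.
base=$(mktemp -d)
mkdir -p "$base/builds/build-1" "$base/builds/build-2"
echo '<?php echo "v1";' > "$base/builds/build-1/hello.php"
echo '<?php echo "v2";' > "$base/builds/build-2/hello.php"
ln -s "$base/builds/build-1" "$base/app.com"

script=/hello.php

# Path built from the symlinked root (like $document_root) versus the
# resolved root (like $realpath_root), before the switch.
with_document_root_before="$base/app.com$script"
with_realpath_root_before="$(readlink -f "$base/app.com")$script"

# Switch the symlink to the new build.
ln -s "$base/builds/build-2" "$base/app.com.tmp"
mv -T "$base/app.com.tmp" "$base/app.com"

with_document_root_after="$base/app.com$script"
with_realpath_root_after="$(readlink -f "$base/app.com")$script"

echo "symlinked path: $with_document_root_before -> $with_document_root_after"
echo "resolved path:  $with_realpath_root_before -> $with_realpath_root_after"
```

A cache keyed on the first path cannot tell the two builds apart, whereas the second path is different for every build.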
The solution still doesn’t work
The solution didn’t work out. After enabling access logs for PHP-FPM and debug logs for NGINX to find out exactly which path NGINX was passing to FPM, it turned out that the path being passed was still something like /var/www/app.com/hello.php
This is the path with the symlink and not the absolute real path. So, why wasn’t this working? Did my version of NGINX support the Realpath functionality? Upon some research, I found out that the Realpath functionality is enabled in NGINX from v0.8. I had version 1.4 in my environment, so it should have worked.
I commented out the fastcgi_param line in the NGINX conf, just to check whether this line was being picked up at all. NGINX was reloaded and the app was still working. I couldn’t understand how FPM knew which file to execute. Then I realized that this line could be getting overridden somewhere. And that turned out to be correct.
Two lines below, the NGINX conf read this: include /etc/nginx/fastcgi_params;
This file has a list of commonly used FastCGI params passed to FPM. I removed the SCRIPT_FILENAME parameter from this file and reloaded NGINX, and now NGINX was passing the real resolved path to FPM. Phew.
This could have been avoided if I had put my fastcgi_param statement below the include statement, since my value would then override the value in that file. Since my statement was above it, the value in the file was overriding the one I had defined.
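A quick way to catch this kind of conflict is to search the whole NGINX config directory for every place SCRIPT_FILENAME is set. The sketch below runs against a temporary stand-in for /etc/nginx. The two files are made-up approximations of the setup described above, and the fastcgi_pass address is hypothetical.

```shell
#!/bin/sh
set -eu

# Temporary stand-in for /etc/nginx.
conf_dir=$(mktemp -d)

cat > "$conf_dir/fastcgi_params" <<'EOF'
fastcgi_param  QUERY_STRING     $query_string;
fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
EOF

cat > "$conf_dir/app.conf" <<'EOF'
location ~ \.php$ {
    fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
    fastcgi_pass 127.0.0.1:9000;
    include fastcgi_params;
}
EOF

# List every definition with its file name and line number.
grep -rn 'SCRIPT_FILENAME' "$conf_dir"

# More than one hit means included files need checking too.
match_count=$(grep -rn 'SCRIPT_FILENAME' "$conf_dir" | wc -l)
echo "definitions found: $((match_count))"
```

On a real server the same search would be pointed at /etc/nginx instead of the temporary directory.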
The new build process
1) Now the builds are fully atomic since a symlink change is enough to tell NGINX to serve from the new document root.
2) No need to reload NGINX or FPM after every build, so no requests are dropped.
3) NGINX takes care of passing the absolute resolved path of each PHP file to FPM, so the caching issues after a new build are gone. There is no need to flush Opcache or the Realpath cache after a new build.
4) If there are any issues on a new build, a symlink change to point to the previous build is enough :)
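A rollback along these lines might look like the sketch below. It assumes build folders are named with sortable timestamps so that the "previous" build is the one just before the current one in sorted order. That naming scheme, the scratch directory and the variable names are all hypothetical, and mv -T again assumes GNU coreutils.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch layout with three timestamp-named builds.
base=$(mktemp -d)
mkdir -p "$base/releases/20240101120000" \
         "$base/releases/20240102120000" \
         "$base/releases/20240103120000"
ln -s "$base/releases/20240103120000" "$base/current"

# Name of the build the symlink currently points to.
current_name=$(basename "$(readlink "$base/current")")

# Find the build immediately before the current one in sorted order.
previous_name=$(ls -1 "$base/releases" | sort |
    awk -v cur="$current_name" '$0 == cur { print prev } { prev = $0 }')

if [ -z "$previous_name" ]; then
    echo "no earlier build to roll back to" >&2
    exit 1
fi

# Repoint the symlink with a rename, the same way a forward deploy would.
ln -s "$base/releases/$previous_name" "$base/current.tmp.$$"
mv -T "$base/current.tmp.$$" "$base/current"

readlink "$base/current"
```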
Cross-posted on https://kanishkdudeja.in/truly-atomic-deployments-with-nginx-and-php-fpm/